COVID-19 risk factors amongst 14,786 care home residents: an observational longitudinal analysis including daily community positive test rates of COVID-19, hospital stays and vaccination status in Wales (UK) between 1 September 2020 and 1 May 2021

Abstract Background COVID-19 vaccinations have been prioritised for high risk individuals. Aim Determine individual-level risk factors for care home residents testing positive for SARS-CoV-2. Study design Longitudinal observational cohort study using individual-level linked data from the Secure Anonymised Information Linkage (SAIL) databank. Setting Fourteen thousand seven hundred and eighty-six older care home residents (aged 65+) living in Wales between 1 September 2020 and 1 May 2021. Our dataset consisted of 2,613,341 individual-level daily observations within 697 care homes. Methods We estimated odds ratios (ORs [95% confidence interval]) using multilevel logistic regression models. Our outcome of interest was a positive SARS-CoV-2 PCR test. We included time-dependent covariates for the estimated community positive test rate of COVID-19, hospital inpatient status, vaccination status and frailty. Additional covariates were included for age, sex and specialist care home services. Results The multivariable regression model indicated an increase in age (OR 1.01 [1.00,1.01] per year), community positive test rate (OR 1.13 [1.12,1.13] per percent increase), hospital inpatients (OR 7.40 [6.54,8.36]), and residents in care homes with non-specialist dementia care (OR 1.42 [1.01,1.99]) had an increased odds of a positive test. Having a positive test prior to the observation period (OR 0.58 [0.49,0.68]) and either one or two doses of a vaccine (0.21 [0.17,0.25] and 0.05 [0.02,0.09], respectively) were associated with a decreased odds. Conclusions Care providers need to remain vigilant despite the vaccination rollout, and extra precautions should be taken when caring for the most vulnerable. Minimising potential COVID-19 infection for care home residents when admitted to hospital should be prioritised.


Introduction
Care homes are a keystone of adult social care. They provide accommodation and care for those needing substantial help with personal care, but more than that, they are people's homes [1,2]. In 2016, there were 11,300 care homes in the UK, with a total of 410,000 residents [3]. Within care homes people live in proximity, and may live with frailty and many different health conditions, making them susceptible to outbreaks of infectious disease [1]. COVID-19 is described by Lithander et al., as ' . . . a dynamic, specific and real threat to the health and well-being of older people' (2020, p.10) [4]. The impacts of COVID-19 on this sub-population have been reported widely in both international and UK media, and in a growing peer reviewed literature.
Vaccinations for SARS-CoV-2 in the UK have been prioritised for those identified at higher risk including to older people living in care homes [5,6]. However, there has been variable uptake in vaccination for care home staff [7]. Additionally, very few vaccination trials have recruited older people or older people with frailty [8]. Most studies involving care homes have used data sources at an aggregated residential-level, rather than individual-level. This has included the risk of outbreaks of COVID-19 following hospital discharges [9], along with increased risk of SARS-CoV-2 infection due to differing levels of community prevalence [10]. Individual-level analyses have focussed on mortality due to COVID-19 in care homes, rather than the risk factors for individuals being infected [11][12][13].
This is the first study we are aware of investigating this vulnerable sub-population at an individual-level with the inclusion of the community positive test rate of COVID-19, hospital admissions and vaccination status. Furthermore, we included information on previous positive SARS-CoV-2 polymerase chain reaction (PCR) tests, age, sex and frailty. As suggested in [14] we were able to do this using up-todate linked data on care homes from the Secure Anonymised Information Linkage (SAIL) Databank [15][16][17].

Objectives
We aimed to identify individual-level risk factors for SARS-CoV-2 infection amongst care home residents in Wales (UK) with the inclusion of community positive test rate of COVID-19, hospital admissions and vaccination status.

Study design
Longitudinal observational cohort study using anonymised linked data from the SAIL Databank.

Participants and setting
Our cohort was 14,786 older care home residents (aged 65+) living in Wales between 1 September 2020 and 1 May 2021. Our dataset consisted of 2,613,341 individuallevel daily observations for the same period within 697 care homes. Residents were included if they lived in a care home at any period during the observation window. Only residents with at least one PCR test (positive or negative) were included within the dataset. Residents were censored if they moved out of the care home or died. All data were collected retrospectively and linked anonymously within the SAIL Databank.

Data sources
We used linked longitudinal data from the SAIL Databank to create our dataset [15][16][17]. We used the Welsh Demographic Service Dataset (WDSD) to determine care home residents and residency dates by linking an anonymised residential linkage field to an anonymised care home registry derived from Care Inspectorate Wales (CIW). The WDSD also contains the Lower layer Super Output Area (LSOA) for each address, which is an area containing approximately 1,500 people, as well as demographic information (age and sex). We used the COVID Vaccine Dataset to identify when individuals had received their vaccinations. The Pathology COVID-19 Daily data were used to identify dates of positive and negative SARS-CoV-2 PCR tests. We linked to the Patient Episode Database for Wales (PEDW) to include an indicator for hospital admissions and to calculate the hospital frailty risk score. We used a combination of the Office for National Statistics (ONS) Annual District Death Extract (ADDE), WDSD, and Consolidated Death Data Source for mortality information for censoring.

Outcome-positive SARS-CoV-2 PCR test
Our outcome of interest was a positive SARS-CoV-2 PCR test. The date of the positive test was recorded as the date when the specimen was collected. We used a binary variable to indicate the positive test dates for individuals.

Exposures
Exposure variables were daily COVID-19 community positive test rate estimates and a daily indicator for if a resident was a hospital inpatient. Daily inpatient status was identified using PEDW, residents were identified as being a hospital inpatient for all dates from the admission to the discharge date (inclusive) of a hospital spell.
The estimated COVID-19 community positive test rate was calculated by removing all tests for care home residents COVID-19 risk factors amongst 14,786 care home residents and creating a geospatial model for each daily observation. The model includes a spatial correlation term that decays with distance. In other words, areas that are close in proximity are likely to have a more similar estimate than those further away. This reduces the impact of artificial boundaries introduced by statistical geographies and provides a more realistic spatial distribution estimate of positive test rates in the communities surrounding a care home. The community test rate estimates were calculated for each LSOA using a 14-day lookback window for each date. The testing strategy in Wales in the observation period included asymptomatic testing of those caring for vulnerable people (e.g. care home workers and healthcare workers), and symptomatic testing for those in the community [18]. Our community positive test rate estimate included both symptomatic and asymptomatic testing. The geospatial model used is an extension of the logistic regression model for binomial (numerator/denominator) data, in which the log-odds of the probability, P(x), of at least one positive PCR test is the unobserved realisation of a spatially correlated stochastic process. Specifically, we used the total number of positive PCR tests in the 14-day window as the numerator, and total tests in the 14-day window as the denominator. The model has three parameters that determine the mean and variance of P(x) and the rate at which the correlation between the values of P(x) at two different locations decays with increasing distance between them, for more detail on the methodology see [19].

Predictors and confounders
We included the number of doses of a vaccine and a binary indicator for if an individual had a positive PCR test prior to the observation period as predictor variables. The number of vaccine doses received was time varying and was recorded as a categorical variable on each date (0 doses, 1 dose, 2 doses). Additional variables included were specialist services provided by the care home where individuals were resident during the study period. This included: nursing care (yes/no), learning disability (yes/no), mental health (yes/no), dementia (no or unknown, non-specialist, specialist). Age (continuous), sex (male/female) and hospital frailty risk score (no score, low, intermediate, high) were included at the individual level. The Hospital Frailty Risk Score (HFRS) was developed using Hospital Episode Statistics (HES), a database containing details of all admissions, Emergency Department attendances and outpatient appointments at NHS hospitals in England, and validated on over 1-million older people using hospitals in 2014/15 [20]. The HFRS uses the International Classification of Disease version 10 [21] (ICD-10) codes to search for specific conditions from secondary care. A weight is then applied to the conditions and a cumulative sum is used to determine a frailty status of: low, intermediate or high. We additionally included a HFRS score of 'No score' for people who had not been admitted to hospital in the look back period.
We calculated the HFRS using PEDW, the Welsh counterpart to HES, on each daily observation, with a 2-year look back of all hospital admissions recorded in Wales on each date.

Longitudinal dataset design
In the longitudinal dataset created for the analysis individuals have multiple daily observations. Each daily observation includes updated time-dependent covariates; the estimated community COVID-19 positive test rate, whether an individual was in hospital, and the number of vaccine doses received. All other variables were fixed at the value of entry into the study. Additionally, the dataset has anonymous individual and care home identifiers used to cluster the observations. For an example of the dataset design see Appendix 1 and 2.

Bias
To help minimise selection bias we only included individuals who had at least one SARS-CoV-2 PCR test. Similarly, to minimise time interval bias we used a large observation period where we believe testing of care home residents was more consistent than at the start of the pandemic. We also included anonymous individual and care home level identifiers to account for correlation amongst repeated measures.

Statistical methods
Descriptive statistics included the demographic information for individuals at the start of their residency and stratifications for those who did and did not have a positive PCR test within the observation period. We produced time plots for the daily estimated community positive test rate, positive test rate of residents, numbers of PCR tests for residents and number of positive PCR tests for residents. For our analysis we used multilevel logistic regression models with a random intercept term for each care home. The regression was applied to the individual-level daily dataset. For sensitivity analysis we calculated null (intercept only) models with random effects at the individual level, care home level, and both (see Appendix 2). Observations with a missing LSOA were removed, and individuals were censored if they moved residence or died.

Participants and descriptive data
We analysed 2,613,341 daily observations, consisting of 14,786 care home residents within 697 care homes. The reasons for exclusion from the analysis dataset are included in Figure 1. Descriptive statistics taken from the first observation for each individual are included in Table 1. We observed a high mean age of 85 (standard deviation 8.2), a large proportion of females (69%), and a high proportion of individuals with frailty (61.8% with low, intermediate or high HFRS). To provide an indication of demographic differences we also stratified the variables by those who did and did not have a positive SARS-CoV-2 PCR test at any point within the observation period.
Cross-tabulations for the observations with and without positive tests whilst a hospital inpatient and the number of vaccine doses is displayed in Table 2. Results indicated a high proportion of observations with a positive PCR test had not been vaccinated (96%), and of those with a positive test who were unvaccinated a significant proportion were hospital inpatients (19.7%). A significantly larger proportion of observations with a positive PCR test were identified for hospital inpatients (0.93%) compared to not (0.08%). Figure 2 shows the daily estimated community positive test rate of COVID-19 along with the positive test rate, number of tests taken and number of positive tests for care home residents. The estimated community positive test rate of COVID-19 was largely correlated with the positive test rate amongst care home residents, with peaks in November and January. There was a large decrease in testing and positive tests amongst care home residents after February.

Logistic regression for positive SARS-CoV-2 PCR tests
The univariable and multivariable multilevel logistic regression results for positive SARS-CoV-2 PCR tests are presented in Table 3 The random effects term indicated significant variance at the care home level in the multivariable model (intercept variance of 2.51 with standard error 0.178). We compared intercept only (null) models with random effects terms at the individual, care home, and individual and care home level and found most of the variability was accounted for at the care home level, see Appendix 3.

Discussion
An increase in the community positive test rate of COVID-19 led to an increase in the odds of care home residents testing positive, with an OR of 1.13 (1.12,1.13) per percentage increase of community positive test rate. As the community positive test rate estimates ranged between 0% and 20% in the observed time period this suggests a potential 10-fold increase in the odds of a positive test in care home residents at the peak of community positive test rate. The association between community positive test rate and care home residents having a COVID-19 infection warrants further research with routes of potential ingress needing further exploration to provide evidence and develop robust policies including clear guidelines for staff and visitors. This should also take into consideration qualitative research investigating the overall well-being of care home residents [7,22]. Overall this finding highlights the importance of promoting strategies that maintain lower levels of community positive test rate which help reduce the odds of infection amongst vulnerable populations. For privacy protection the daily total of care home residents with positive tests has been masked (removed) where there were less than five positive tests on a particular date.

J. Hollinghurst et al.
We found care home residents had a large increased odds of testing positive for SARS-CoV-2 whilst in hospital as an inpatient with an adjusted OR of 7.40 (6.54, 8.36). This could be due to the type of hospital care home residents are admitted to, with an estimated 61.9% of COVID-19 infection being hospital acquired in residential community care hospitals, compared to 9.7% in hospitals providing acute and general care [23]. Residents entering a hospital environment may be more likely to be exposed to COVID-19 due to an increased contact with healthcare workers and close proximity to other patients in wards [24]. We also postulate that the increased odds of testing positive could be associated with an increased probability of being PCR tested whilst in hospital compared to a care home. This could also be confounded by the screening methods and guidelines in care homes (such as only testing symptomatic cases) compared to hospitals.
Increased age, sex (male) and increased frailty severity were associated with an increased odds of a positive test in the univariable models. These factors are individual-level characteristics that would possibly require an increased level of care, and therefore increased daily personal contact, potentially increasing the chance of transmission of COVID-19. In the multivariable model sex and frailty were statistically insignificant, suggesting a reduced importance compared to community positive test rate and hospital stays. Those testing positive for SARS-CoV-2 prior to the observation period and individuals with one and two doses of a vaccine were at a reduced odds of a positive PCR test, with adjusted odds ratios of 0.58 (0.49,0.68), 0.21 (0.17,0.25) and 0.05 (0.02,0.09), respectively. Those who had previously tested positive may have built up immunity to COVID-19 in a similar way to receiving a vaccine [25,26]. The crosstabulated results indicated the probability of a positive test was elevated for those in hospital with a vaccine compared to those not in hospital. This suggests although the vaccine may be effective, there is still an increased risk of a positive test for hospital inpatients.

Strengths
We were able to create a large longitudinal dataset of care home residents and include linked individual-level data on hospital admissions, COVID-19 community positive test rate, vaccinations and demographic information. This was all possible using existing data from the SAIL Databank.

Limitations
We observed a change in the testing regime from February, where the total number of tests was reduced. We did not explicitly investigate the impact of the change in testing amongst care home residents but found the care home positive test rate remained consistent in shape and magnitude compared with the estimated community positive test rate. We were unable to link all care homes in Wales in the SAIL databank (91%, 948 of 1,048). We were unable to include details on care home staff who may have tested positive for SARS-CoV-2 as data on care home staff are currently cannot be linked to specific care homes. We restricted our cohort to older adults (aged 65+) living in care homes and those that did not move between care homes within the observation period (temporary residents) which may not be representative of all care home residents.

Conclusion
Our findings indicate that measures taken to prevent COVID-19 from spreading to the most vulnerable in society are not completely effective. Although vaccination profoundly decreased the odds of testing positive for SARS-CoV-2, there was still an increased risk of infection for vaccinated individuals admitted to hospital. We also found that an increased community positive test rate of COVID-19 was associated with an increased odds of infection in care home residents. This may reflect higher risk from visitors or care staff living locally. Achieving and maintaining very high rates of vaccination in residents, health and care staff, and visitors is essential. We suggest that care providers need to stay vigilant despite the vaccination rollout, and extra precautions should be taken when caring for the most vulnerable. Follow-up research is needed to evaluate the impact of vaccine waning and changing prevalence of viral strains.